means for accessing a database structure providing a plurality of different subject
matter categories, the database containing a classified vocabulary including a plurality of
terms in each of the different subject matter categories with each term being classified in
accordance with the subject matter category structure of the database;

means for receiving in computer-readable form a text document to be classified; 
processor means operable to compare terms appearing in the text document with the
terms in the classified vocabulary and to determine from the comparison the category for the
document; and

means for supplying a signal carrying data representing the text document and data
associating the text document with the determined category.

4. (Amended) The apparatus according to claim 1, wherein the processor means is
operable to determine the category for the document by determining from the comparison the
category or categories of terms in the document, assigning weightings to the determined
categories for the terms, and assigning the document being classified to the category having
the highest weighting.

7. (Amended) The apparatus according to claim 4, wherein the processor means is
operable, for each term in the classified vocabulary and in the text document, to share a
predetermined weighting factor between each category associated with the term.
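
The weighting recited in claims 4 and 7 can be illustrated with a minimal sketch. It is not taken from the application: the vocabulary contents, the function name classify_by_weighting, the whitespace tokenization and the default weighting factor of 1.0 are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical classified vocabulary: term -> categories it is classified in.
VOCABULARY = {
    "orbit": ["astronomy"],
    "planet": ["astronomy"],
    "star": ["astronomy", "entertainment"],
    "film": ["entertainment"],
}


def classify_by_weighting(text, vocabulary, weight_per_term=1.0):
    """Return (best_category, weightings) for a text document.

    Each vocabulary term found in the document shares a predetermined
    weighting factor equally between the categories associated with it.
    Equal sharing is an assumption; the claims only say the factor is shared.
    """
    weightings = defaultdict(float)
    for term in text.lower().split():  # naive tokenization (assumption)
        categories = vocabulary.get(term)
        if not categories:
            continue  # term is not in the classified vocabulary
        share = weight_per_term / len(categories)
        for category in categories:
            weightings[category] += share
    if not weightings:
        return None, {}
    # Assign the document to the category having the highest weighting.
    best = max(weightings, key=weightings.get)
    return best, dict(weightings)


category, weights = classify_by_weighting(
    "the planet and its star share an orbit", VOCABULARY
)
```

In this sketch the ambiguous term "star" contributes half of its weighting factor to each of its two categories, so unambiguous terms count for more in the final decision.
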

8. (Amended) The apparatus according to claim 1, wherein the accessing means is
arranged to access a plurality of collocations also forming part of the database, each
collocation being associated with a specific different one of the subject matter categories and
each collocation including a plurality of terms exemplifying the associated category.



(Amended) A computer processing apparatus for classifying a document, 

comprising: 



means for accessing a database having a database structure providing a plurality of
different subject matter categories, the database containing a classified vocabulary including
a plurality of terms in each of the different subject matter categories with each term being
classified in accordance with the subject matter category structure of the database and the
database also containing a plurality of collocations each collocation being associated with a
specific different one of the subject matter categories and each collocation including a
plurality of terms exemplifying the associated category;

means for receiving in computer-readable form a text document to be classified;
processor means operable to compare terms appearing in the text document with the 
collocations to determine the collocation having the most terms in common with the 
document, and to allocate the category of the determined collocation to the document; and 
means for supplying a signal carrying data representing the text document and data 
associating the text document with the determined category. 
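
The collocation comparison recited in this claim can be sketched as follows. The collocation contents, the function name classify_by_collocation, the whitespace tokenization and the tie-breaking rule (the first collocation with the largest overlap wins) are assumptions for illustration only; the claim does not specify them.

```python
# Hypothetical collocations: one set of exemplifying terms per category.
COLLOCATIONS = {
    "astronomy": {"telescope", "orbit", "galaxy", "planet"},
    "cooking": {"oven", "recipe", "simmer", "flour"},
}


def classify_by_collocation(text, collocations):
    """Return (category, overlap) for the collocation sharing most terms."""
    document_terms = set(text.lower().split())  # naive tokenization (assumption)
    best_category, best_overlap = None, 0
    for category, terms in collocations.items():
        overlap = len(document_terms & terms)  # terms in common
        if overlap > best_overlap:  # ties keep the earlier collocation
            best_category, best_overlap = category, overlap
    return best_category, best_overlap


result = classify_by_collocation(
    "simmer the sauce then add flour to the recipe", COLLOCATIONS
)
```

A document sharing no terms with any collocation yields a category of None in this sketch, leaving the caller to decide how to handle unclassifiable documents.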

19. (Amended) The apparatus according to claim 7, wherein the accessing means is 
arranged to access the collocations from store means separate from the remainder of the
database. 

20. (Amended) The apparatus according to claim 1, further comprising store means
configured to store the database.

. (Amended) The apparatus according to claim 1, further comprising store means 
storing the database.

22. (Amended) The apparatus according to claim 1, wherein the database structure
provides said plurality of subject matter categories as a tree structure including a plurality of
main subject matter areas each divided into two or more subject matter areas.

23. (Amended) The apparatus according to claim 1, wherein the database structure
provides said plurality of subject matter categories such that each category is defined by a
subject matter area and a species or genus.

24. (Amended) The apparatus according to claim 23, wherein the database provides
said plurality of subject matter categories such that the species or genera are people, places,
organizations, products and technology.

25. (Amended) The apparatus according to claim 23, wherein the database structure 
provides said plurality of subject matter categories such that the species or genus are the 
same for each subject matter area. 

26. (Amended) The apparatus according to claim 1, wherein the database provides
categories in each of the following subject matter areas: the universe, the earth, the
environment, natural history, humanity, recreation, society, the mind and human history.

27. (Amended) The apparatus according to claim 1, wherein the database structure is 
such that, for a given meaning, a term is associated with only one category and different 
meanings of the same term are associated with different categories. 

28. (Amended) The apparatus according to claim 1, wherein the supplying means
comprises means for storing a signal supplied by the supplying means on a computer
readable medium.

29. (Amended) The apparatus according to claim 1, wherein the supplying means
comprises means for forwarding a signal supplied by the supplying means to another
processing apparatus.

30. (Amended) The apparatus according to claim 1, wherein the supplying means
comprises means for displaying the information to a user. 



31. (Amended) In a computer processing apparatus having means for accessing a
database having a database structure providing a plurality of different subject matter
categories, the database containing a classified vocabulary including a plurality of terms in
each of the different subject matter categories with each term being classified in accordance
with the subject matter category structure of the database and means for receiving in
computer-readable form a text document to be classified, a method of classifying documents
comprising:

comparing terms appearing in the text document with the terms in the classified
vocabulary;

determining from the comparison the category for the document; and
supplying a signal carrying data representing the text document and data associating
the text document with the determined category.



34. (Amended) The method according to claim 31, further comprising determining
the category for the document by determining from the comparison the category or categories
of the terms in the document, assigning weightings to the determined categories for the terms,
and assigning the document being classified to the category having the highest weighting.

35. (Amended) The method according to claim 34, further comprising assigning
weighting by, for each term in the classified vocabulary and in the text document, sharing a
predetermined weighting factor between each category associated with the term.

36. (Amended) The method according to claim 31, further comprising accessing a
plurality of collocations also forming part of the database, each collocation being associated
with a specific different one of the subject matter categories and each collocation including a
plurality of terms exemplifying the associated category.

39. (Amended) In a computer processing apparatus having means for accessing a
database having a database structure providing a plurality of different subject matter
categories, the database containing a classified vocabulary including a plurality of terms in
each of the different subject matter categories with each term being classified in accordance
with the subject matter category structure of the database and the database also containing a
plurality of collocations each collocation being associated with a specific different one of the
subject matter categories and each collocation including a plurality of terms exemplifying the
associated category and having means for receiving in computer-readable form a text
document to be classified, a method of classifying documents comprising:

comparing terms appearing in the text document with the collocations to determine the
collocation having the most terms in common with the document;

allocating the category of the determined collocation to the document; and
supplying a signal carrying data representing the text document and data associating
the text document with the determined category.

43. (Amended) The method according to claim 36, further comprising accessing the
collocations from store means separate from the remainder of the database.



50. (Amended) The method according to claim 31, further comprising carrying out
the supplying by storing a signal on a computer-readable medium. 

51. (Amended) The method according to claim 31, further comprising carrying out
the supplying by forwarding a signal to another processing apparatus.

52. (Amended) The method according to claim 31, further comprising displaying the 
information to a user. 

53. (Amended) A database for use with an apparatus in accordance with claim 1, the
database having a database structure providing a plurality of different subject matter
categories, the database containing a classified vocabulary including terms in each of the
different subject matter categories with each term being classified in accordance with the
subject matter category structure of the database.



56. (Amended) A database for use with an apparatus in accordance with claim 12, the
database having a database structure providing a plurality of different subject matter
categories, the database containing a classified vocabulary including a plurality of terms in
each of the different subject matter categories with each term being classified in accordance
with the subject matter category structure of the database and the database also containing a
plurality of collocations each collocation being associated with a specific different one of the
subject matter categories and each collocation including a plurality of terms exemplifying the
associated category.






65. (Amended) An apparatus for classifying electronic documents, comprising:
storage means storing a classification scheme having a plurality of collocations each
collocation being associated with a respective different subject matter area and containing a
set of terms which exemplify that subject matter area;

means for comparing terms used in a document to be classified with the terms in said
collocations;

means for allocating the document being classified to the one of said collocations
which said comparing means identifies as having the most number of terms in common with
the document being classified;

means for associating with the document being classified a code representing the
subject matter area of the allocated collocation; and
means for storing the document together with the associated code.

Please add new Claims 79-80 as follows: 

79. (New) A computer processing apparatus for classifying documents, the apparatus
comprising:

a database having a database structure defining a classification scheme for terms,
the classification scheme having subject matter data defining main and
subsidiary subject matter domains into which terms can be classified and genera data
defining a predetermined number of genera to which terms can be allocated, the classification
scheme being such that a term can be allocated to more than one subject matter domain but to
only one genus so that each specific combination of subsidiary subject matter domain and
genus defines a unique category,

the database also having classified vocabulary comprising a set of terms classified in
accordance with the classification scheme such that each term is associated with category
data identifying the corresponding category,

the database also including a classification scheme data set which includes a
respective different classification scheme data set item associated with each category,

each classification scheme data set item comprising a collocation including a
list of terms that may be used to describe the function, appearance or relationship with other
objects of classified terms in that category or that may be used in relation to terms in that
category;

a receiver operable to receive in computer-readable form a text document to be
classified;

a processor configured to compare terms in the text document with terms in at least
one of the classified vocabulary and the collocations to determine a category for the text
document; and

a signal supplier configured to supply a signal carrying data representing the text
document and data associating the text document with the determined category data.

80. (New) A method of classifying documents, comprising:
providing a classification scheme having subject matter data defining main and
subsidiary subject matter domains into which terms can be classified and genera data
defining a predetermined number of genera to which terms can be allocated, the classification
scheme being such that a term can be allocated to more than one subject matter domain but to
only one genus so that each specific combination of subsidiary subject matter domain and
genus defines a unique category;

providing a classified vocabulary comprising a set of terms classified in accordance
with the classification scheme such that each term in the classified vocabulary is associated
with category data identifying the corresponding category;

providing a classification scheme data set which includes a respective different
classification scheme data set item associated with each category with each classification
scheme data set item comprising a collocation including a list of terms that may be used to
describe the function, appearance or relationship with other objects of classified terms in that
category or that may be used in relation to terms in that category;

receiving data representing a text document to be classified; and
comparing terms in the text document with terms in at least one of the classified
vocabulary and the collocations to determine a category for the text document.
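
The classification scheme recited in claims 79 and 80, in which a term may belong to several subject matter domains but to only one genus, can be modelled with a small sketch. The Category type, the helper categories_for_term and the example domains are hypothetical; the list of genera is borrowed from claim 24 purely as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """A unique category: one subsidiary domain combined with one genus."""
    domain: str  # subsidiary subject matter domain
    genus: str


# Predetermined genera (sample values taken from claim 24).
GENERA = ("people", "places", "organizations", "products", "technology")


def categories_for_term(domains, genus):
    """Return the categories of a term allocated to several domains.

    The term has exactly one genus, so it receives one category per domain.
    """
    if genus not in GENERA:
        raise ValueError(f"unknown genus: {genus}")
    return {Category(domain, genus) for domain in domains}


# Hypothetical term allocated to two domains under a single genus.
telescope_categories = categories_for_term(
    ["astronomy", "navigation"], "technology"
)
```

Because the genus is fixed per term, the number of categories a term receives equals the number of subsidiary domains it is allocated to.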



